It is also possible that a sufficiently well thought out associative memory system could 
equal deep neural networks. The point that struck me in the Memory Mosaics paper 
was the conditional expectation aspect. A very simple associative memory would be 
E(x|b=0) and E(x|b=1) where b is a bit from a locality sensitive hash. And funnily enough 
that works. It produces a crude but rather general result. For an LSH based on random 
projections (e.g. an input vector multiplied by a matrix filled with random numbers) you 
are just creating a random decision plane through the entire input space and working 
out an average response value (x) depending on whether the input vector is on one side 
of the decision plane or the other. You can extend the idea to bit strings, say cde: 
E(x|cde=101) conditions on three decision planes at once, so it is more specific, and 
lengthening the bit string takes you from the more general to the more specific. If you 
combine, in various ways, expectations based on bit strings of length 1, 2, 3, ..., n, then 
you are unifying general and specific responses. It sounds like a good 
enough idea to actually code it up and give it a try. 
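
As a starting point, here is a minimal sketch of the idea in plain Python. Several things in it are my own assumptions rather than anything from the Memory Mosaics paper: the class name LSHExpectationMemory, the use of bit-string prefixes (first bit, first two bits, and so on) as the strings of length 1, 2, ..., n, a scalar response x, and an unweighted mean as the way of combining the general and specific expectations. Any of these could reasonably be swapped for something else, such as a count-weighted mean.

```python
import random
from collections import defaultdict


class LSHExpectationMemory:
    """Running averages of a scalar response x, keyed by LSH bit-string prefixes."""

    def __init__(self, dim, n_bits, seed=0):
        rng = random.Random(seed)
        # One random decision plane (a Gaussian normal vector) per hash bit.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(n_bits)]
        self.n_bits = n_bits
        # Maps a bit-string prefix to [sum of x, number of samples].
        self.stats = defaultdict(lambda: [0.0, 0])

    def hash_bits(self, v):
        # Each bit records which side of one random plane the input falls on.
        return "".join(
            "1" if sum(p_i * v_i for p_i, v_i in zip(plane, v)) >= 0 else "0"
            for plane in self.planes
        )

    def train(self, v, x):
        bits = self.hash_bits(v)
        # Update every prefix: length 1 is the most general, n_bits the most specific.
        for k in range(1, self.n_bits + 1):
            entry = self.stats[bits[:k]]
            entry[0] += x
            entry[1] += 1

    def predict(self, v):
        bits = self.hash_bits(v)
        estimates = []
        for k in range(1, self.n_bits + 1):
            # Use .get so that lookups do not create empty entries.
            entry = self.stats.get(bits[:k])
            if entry is not None and entry[1] > 0:
                estimates.append(entry[0] / entry[1])  # E(x | prefix)
        if not estimates:
            return None  # nothing stored on this side of the first plane
        # Assumed combination rule: unweighted mean over all prefix lengths seen.
        return sum(estimates) / len(estimates)
```

Typical use would be to create the memory with something like LSHExpectationMemory(dim=2, n_bits=3), call train(v, x) for each input vector and response, and then call predict(v) on new inputs. With few training samples the short prefixes dominate and the answer is crude but general; with more data the longer prefixes get filled in and pull the answer toward the specific.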


